METHODS OF PRODUCING DNA AND PROTEIN LIBRARIES

The present invention relates to methods of producing DNA libraries having 
randomised amino acid encoding codons at predetermined positions within the 
sequence and corresponding protein libraries. 

Codon randomisation is performed to generate a randomised gene library, the
library containing multiple variations of just one gene. Randomised codons
may be separated by conserved sequences or else may be contiguous. The
resulting gene libraries may be expressed to generate protein libraries, which
are subsequently screened to find a protein with an activity of interest. The
technique is used predominantly in protein engineering.

In the production of protein libraries, standard randomisation techniques require
an excess of genes to be cloned, since randomised codons NNN (64 codons
where N represents A, T, G or C) or NNG/T (32 codons) must be cloned to
ensure that all 20 amino acids are represented. Thus, as the number of
randomised codons increases, the ratio of genes to proteins produceable (i.e. a
set in which every possible variation is represented) increases exponentially.
Hine et al. have recently described an alternative method for producing a DNA
library which encodes for all amino acids at two or more predetermined
positions that involves selective hybridisation of individually synthesised
oligonucleotides to a traditionally randomised template to circumvent this
problem (PCT publication WO 00/15777, which reference is incorporated
herein in its entirety). The method involves, for each predetermined position,
hybridising a pool of oligonucleotides to a region of a traditionally randomised
template containing that predetermined position. Any given amino acid (at the
predetermined position) is only encoded for once in each oligonucleotide pool.
The technique is called "MAX" randomisation, and the codons chosen for the
oligonucleotide probes are known as MAX codons. The benefit of the
technique is that as the number of randomised codon positions increases, the
ratio of genes to proteins produceable remains constant. Although an
improvement over traditional methods, since each gene encodes for a unique
protein, this method results in a relatively high number (~10%) of non-MAX
(i.e. undesirable) codons at the randomised amino acid encoding positions. In
addition, very small quantities of DNA containing the differing combinations
of selected codons are produced, making subsequent manipulations technically
difficult.
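
The scaling argument above can be illustrated with a short calculation. The
following is a minimal sketch only: the function and dictionary names are
illustrative and not part of the described method, and it assumes that every
position can carry any of the 20 amino acids (stop codons encoded by NNN or
NNG/T are ignored).

```python
# Sketch: ratio of distinct gene sequences to distinct proteins for
# n randomised codon positions under three randomisation schemes.
# Assumption: 20 amino acids per position; stop codons are ignored.

SCHEMES = {"NNN": 64, "NNG/T": 32, "MAX": 20}  # codons available per position


def gene_to_protein_ratio(codons_per_position, n_positions, amino_acids=20):
    """Distinct genes divided by distinct proteins for n positions."""
    return (codons_per_position ** n_positions) / (amino_acids ** n_positions)


def ratio_table(max_positions=6):
    """Ratios for 1..max_positions randomised codons, keyed by scheme."""
    return {
        name: [gene_to_protein_ratio(c, n) for n in range(1, max_positions + 1)]
        for name, c in SCHEMES.items()
    }


if __name__ == "__main__":
    for name, ratios in ratio_table().items():
        print(name, [round(r, 2) for r in ratios])
```

Under these assumptions the NNN and NNG/T ratios grow as 3.2^n and 1.6^n
respectively, whereas the MAX ratio stays at 1 for any number of positions.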

It is an object of the present invention to obviate or mitigate one or more of the 
known problems by providing an improved method of producing DNA libraries
encoding all possible amino acids at predetermined positions. 

According to a first aspect of the present invention there is provided a method 
of producing a DNA library comprising a plurality of DNA sequences of 
interest, each DNA sequence of interest having at least two predetermined 
positions, with at each predetermined position a codon selected from a defined 
group for that position, the codons within a group coding for different amino 
acids, said method comprising the steps of: - 

(i) contacting so as to effect hybridisation (a) template DNA comprising
said at least two predetermined positions, said template DNA being fully
randomised at said at least two predetermined positions, (b) for each
predetermined position, a selection oligonucleotide pool, each selection
oligonucleotide within each pool comprising a codon selected from the defined
group for that predetermined position, and (c) at least one additional
oligonucleotide sequence comprising a region which is non-hybridisable to the
template DNA,

(ii) ligating the hybridised DNA sequences, 

(iii) denaturing the product of step (ii) so as to give a mixed population of 
said template DNA and said DNA sequences of interest, and 

(iv) selectively amplifying the DNA sequences of interest, 

wherein said additional oligonucleotide sequence of step (i) is selected such 
that after step (ii) the non-hybridisable region is located externally of (i.e.
"overhangs") the template DNA. 

From the foregoing, it will be understood that each defined group may consist
of up to but no more than 20 codons. 

It will be understood that the term "predetermined position" as used herein 
refers to a specific codon position within the DNA sequence of interest and 
also to the corresponding codon position within the complementary template
DNA. 

It will be further understood that the term "template DNA" refers to a 
population of DNA sequences differing only at the predetermined positions, 
where the codon sequence is fully randomised (i.e. all possible trinucleotide 
combinations are represented at those positions). The DNA sequences may be 
a gene sequence or a partial gene sequence. 



Preferably, said defined group consists of the codons: 



AAA, AAC, ACC, AGC, ATG, ATT, CAG, CAT, CCG, CGC, CTG, GAA,
GAT, GCG, GGC, GTG, TAT, TGG, TGC, TTT.



Hereinafter, these codons will be referred to as "MAX" codons. The MAX 
codons have been chosen since they represent the optimum codon usage for 
each amino acid in the model organism Escherichia coli. It will be readily 
apparent that, if desired, any of the MAX codons may be substituted for an 
alternative codon coding for the same amino acid. It may be desirable to 
substitute codons due to differing optimum codon usage in different organisms. 
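
As an illustrative check (a sketch, not part of the described method), the 20
MAX codons listed above can be translated with the standard genetic code to
confirm that each encodes a different amino acid and that none is a stop
codon. The names CODON_TABLE and translate_max_codons are assumptions made for
this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# The MAX codons as listed in the description.
MAX_CODONS = [
    "AAA", "AAC", "ACC", "AGC", "ATG", "ATT", "CAG", "CAT", "CCG", "CGC",
    "CTG", "GAA", "GAT", "GCG", "GGC", "GTG", "TAT", "TGG", "TGC", "TTT",
]


def translate_max_codons():
    """Map each MAX codon to its one-letter amino acid code."""
    return {codon: CODON_TABLE[codon] for codon in MAX_CODONS}


if __name__ == "__main__":
    encoded = translate_max_codons()
    print(encoded)
    print("distinct amino acids:", len(set(encoded.values())))
```

The same table can be used to pick an alternative codon for a given amino
acid when a different host organism's codon usage is preferred.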

In particular, one or more of the defined groups may contain codons encoding 
for less than 20 amino acids. Thus, for each predetermined position, the 
defined groups may be the same or different. In some circumstances it may be
desirable for a defined group to encode for less than 20 amino acids, for 
example if a particular amino acid or type of amino acid (e.g. basic, polar or 
non polar) is required at a particular predetermined position in the expressed 
protein. 

Said additional oligonucleotide sequence may form part of the oligonucleotides
in one of the selection pools. It will be understood that for the
non-hybridisable region of the additional sequence to be located externally of
the template DNA after step (ii), the additional sequence must be located
towards an end (which must be the 3' end for subsequent amplification) of the
newly formed strand relative to the predetermined positions (i.e. the additional
sequence cannot be between two predetermined positions).

Preferably, however, said additional oligonucleotide sequence is a separate
oligonucleotide having a region complementary to the 5' end of the template
DNA.

Preferably, in step (i) each selection oligonucleotide pool is added in excess of
that required to hybridise with template DNA (useable template DNA) where
NNN of the relevant predetermined position is complementary to the MAX
codons. Preferably, the ratio of each selection oligonucleotide pool to useable
template DNA is at least 2:1, more preferably at least 5:1, even more preferably
at least 10:1, and most preferably about 12:1.

In a first series of embodiments, the template DNA is attached to a support
(e.g. polymeric bead) prior to step (i) such that after the denaturation
(separation) of the double stranded DNA construct in step (iii), the template
DNA is removed, for example by centrifugation or magnetism, before step (iv).
Step (iv) is then effected by PCR utilising the overhanging non-hybridisable
region of the additional sequence as a primer binding site (hence the
requirement for it to be at the 3' end of the sequence of interest).

In a second series of embodiments, the method includes contacting a second
additional oligonucleotide sequence in step (i). This second additional
oligonucleotide also comprises a non-hybridisable region, the second additional
sequence being designed such that after step (ii) it is located at the 5' end of the
sequence of interest, with the non-hybridisable region overhanging the 3' end
of the template DNA. As with the first additional sequence, the second
additional sequence may form part of the oligonucleotides in one of the
selection pools, or it may be a separate oligonucleotide. During step (iv) a first
primer complementary to the non-hybridisable region of the first additional
sequence, and a second primer identical to the non-hybridisable region of the
second additional sequence are used. It will be readily apparent to the skilled
person that the first primer will bind to the sequence of interest at its 3' end
initiating synthesis of a complementary strand. The second primer will then
hybridise to the complementary strand (at its 3' end) thereby initiating
synthesis of the sequence of interest. The primers will not bind the template
DNA which will therefore not be amplified. As a result it is not necessary to
remove the template DNA prior to step (iv).
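
The primer logic of the second series of embodiments can be sketched in a few
lines. This is a toy model under stated assumptions: the template and overhang
sequences are hypothetical placeholders (not the sequences of the examples),
and "binding" is simplified to a perfect reverse-complement match.

```python
# Toy model of selective amplification with two overhanging regions.
# All sequences are hypothetical; binding means perfect complementarity.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def can_prime(primer, strand):
    """True if the primer is perfectly complementary to part of the strand."""
    return revcomp(primer) in strand


TEMPLATE = "CTGACTTCGAAATCAGCG"   # hypothetical template strand
OVERHANG_5 = "ACACACACAC"         # hypothetical 5' non-hybridisable region
OVERHANG_3 = "TTGGTTGGTT"         # hypothetical 3' non-hybridisable region

# Ligated strand: 5' overhang + region complementary to template + 3' overhang.
sequence_of_interest = OVERHANG_5 + revcomp(TEMPLATE) + OVERHANG_3

first_primer = revcomp(OVERHANG_3)   # complementary to the 3' overhang
second_primer = OVERHANG_5           # identical to the 5' overhang

# Strand made by extending the first primer along the sequence of interest.
complementary_strand = revcomp(sequence_of_interest)

if __name__ == "__main__":
    print("first primer on sequence of interest:",
          can_prime(first_primer, sequence_of_interest))
    print("second primer on complementary strand:",
          can_prime(second_primer, complementary_strand))
    print("either primer on template:",
          can_prime(first_primer, TEMPLATE) or can_prime(second_primer, TEMPLATE))
```

In this simplified model both primers find a site on the ligated product or
its copy, and neither finds one on the template, which is the property the
overhanging regions are intended to provide.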

Preferably, the amplified DNA sequences of interest are inserted after step (iv) 
into a suitable cloning vector. The cloning vector may be any type of 
prokaryotic or eukaryotic cloning vector such as an expression vector, an 
integrating vector or a bacteriophage vector and is chosen according to the 
intended use of the library. 

Preferably, prior to insertion into the cloning vector, the DNA sequences are 
digested by a restriction endonuclease in order to generate the required cassette 
for cloning. For this purpose, a restriction endonuclease recognition site is 
present in the required location in the sequences of interest. The recognition 
site is preferably provided in the initial template DNA. Preferably, said 
restriction endonuclease recognition site is a unique site within the DNA 
sequence. 

The sequences of interest, which will not generally be full gene sequences, may
be inserted into an appropriate gene. The gene insertion step may be effected
prior to or concomitantly with insertion into an appropriate cloning vector.

Preferably, the cloning vectors containing DNA sequences of interest are
transformed into suitable host cells by any suitable method, for example by heat
shock, electroporation or by bacteriophage infection, after suitable packaging
of a bacteriophage vector.

The present invention further resides in a DNA library produceable by the
method of the first aspect.

According to a second aspect of the present invention there is provided a
method of producing a protein library comprising a plurality of polypeptides,
each polypeptide having a different combination of amino acid residues in at
least two predetermined positions, said method comprising the step of
expressing the sequences of interest produced by the method of the first aspect.

It will be understood that the population of polypeptides produced have MAX
encoded amino acid residues at positions corresponding to the predetermined
positions in the DNA sequence of interest.

The present invention further resides in a protein library produceable by the
method of the second aspect.

The present invention still further resides in the use of said protein library to
investigate binding interactions between the proteins (polypeptides) in the
library and any appropriate ligand such as DNA, and other proteins or ligands.
For example, said protein library can be used to investigate the binding
interactions of randomised zinc fingers or randomised antibodies.



Embodiments of the invention will now be described, by way of example only,
with reference to the accompanying diagrams in which:

Fig. 1 shows schematically a method of producing DNA sequences containing
MAX codons according to a comparative example,
Fig. 2 shows the distribution of MAX codons and non-MAX codons at the
predetermined positions within a DNA sequence produced by the method of
the comparative example,

Fig. 3 shows schematically a method of producing DNA sequences containing
MAX codons according to a first embodiment of the present invention,
Fig. 4 shows the distribution of MAX codons and non-MAX codons at the
predetermined positions within a DNA sequence produced by the method of
the first embodiment of the present invention,

Fig. 5 shows schematically a method of producing DNA sequences containing
MAX codons according to a second embodiment of the present invention,
Fig. 6 shows the distribution of MAX codons and non-MAX codons at the
predetermined positions within a DNA sequence produced by the method of
the second embodiment of the present invention having a ratio of selection
oligonucleotide : useful template DNA of about 1:1, and
Fig. 7 shows the distribution of MAX codons and non-MAX codons at the
predetermined positions within a DNA sequence produced by the method of
the second embodiment of the present invention having a ratio of selection
oligonucleotide : useful template DNA of about 12:1.

PRODUCTION OF DNA LIBRARIES 



1. COMPARATIVE EXAMPLE 

Figure 1 shows schematically a method of producing a randomised DNA
library containing MAX codons at three specified positions according to a
comparative example. In figure 1, "N" denotes the presence of any nucleotide,
whereas MAX denotes a codon, each MAX codon being one of the group of 20
codons consisting of: -

AAA, AAC, ACC, AGC, ATG, ATT, CAG, CAT, CCG, CGC, CTG, GAA,
GAT, GCG, GGC, GTG, TAT, TGG, TGC, TTT.

Each of the above MAX codons codes for a different one of the 20 amino
acids.

The main stages involved in the production of the library are: - 

1. mixing the template DNA (A) randomised at the predetermined
positions, selection oligonucleotides (B) and an additional oligonucleotide (C)
complementary to the 5' end of the template DNA,

2. effecting hybridisation of the oligonucleotides to template DNA
sequences which have codons complementary to the MAX codons at the
predetermined positions,

3. ligating the hybridised sequences, and

4. inserting the double stranded DNA constructs into an appropriate vector.

The template DNA comprises a plurality of sequences which are identical other
than at the predetermined positions (denoted by "N" in the template DNA).
Selection oligonucleotides will not tend to hybridise at the predetermined
positions to those template strands which do not have a sequence
complementary to one of the MAX codons at any of these positions. It will be
noted that in the comparative example shown, the template DNA extends in the
5' direction beyond the endmost predetermined position. The additional
oligonucleotide is complementary to this 5' end region and its purpose is to
ensure that double stranded DNA is formed for the required length of the
template DNA.
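
The selection step can be illustrated by enumerating the fully randomised
template triplets and keeping those that are complementary to a MAX codon.
This is a sketch under the assumption of perfect, all-or-nothing base pairing
(real hybridisation tolerates some mismatches); the function names are
illustrative only.

```python
from itertools import product

# The MAX codons as listed in the description.
MAX_CODONS = {
    "AAA", "AAC", "ACC", "AGC", "ATG", "ATT", "CAG", "CAT", "CCG", "CGC",
    "CTG", "GAA", "GAT", "GCG", "GGC", "GTG", "TAT", "TGG", "TGC", "TTT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def usable_template_triplets():
    """Template NNN triplets whose complement is one of the MAX codons."""
    all_triplets = ("".join(t) for t in product("ACGT", repeat=3))
    return [t for t in all_triplets if revcomp(t) in MAX_CODONS]


def usable_fraction(n_positions):
    """Fraction of template strands usable at every one of n positions."""
    return (len(usable_template_triplets()) / 64) ** n_positions


if __name__ == "__main__":
    print("usable triplets per position:", len(usable_template_triplets()))
    print("usable fraction, three positions:", usable_fraction(3))
```

With three predetermined positions, only 20^3 of the 64^3 template variants
(about 3%) can pair with MAX codons at every position, which is consistent
with the small quantities of product noted for the comparative example.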

Hybridisation, ligation and cloning were performed as described below and the
cloned DNA constructs transformed into E. coli DH5α (genotype: F'
φ80dlacZ Δ(lacZYA-argF)U169 deoR recA1 endA1 hsdR17(rK-, mK+) phoA
supE44 λ- thi-1 gyrA96 relA1/F' proAB+ lacIqZΔM15 Tn10(tetr)) chemically
competent cells, which were induced to take up DNA by heat shock. Clones
were picked and plasmid DNA preparations undertaken. The inserts were then
sequenced to identify the sequences of the codons present at the predetermined
positions.

Materials and Methods 
Template DNA production

Template DNA was synthesised by MWG Biotech. At the three predetermined 
codon positions, i.e. the sites of randomisation, the nucleotide sequence NNN 
(where N represents any nucleotide) was specified. This results in a population 
of polynucleotide sequences in which all possible combinations of nucleotides 
are represented at the predetermined positions. 



Selection oligonucleotide production

Selection oligonucleotides were synthesised by MWG Biotech. Selection
oligonucleotides were designed so as to be complementary to contiguous
regions of the template DNA, with each selection oligonucleotide containing
one of the predetermined positions at its 3' end. The selection oligonucleotides
were synthesised in groups of 20 (one group or pool for each predetermined
position) with each member of a group containing a different MAX codon. A
set of three selection oligonucleotide pools were thus produced with each pool
having all 20 MAX codons represented.
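
The pool design described above can be sketched as follows. The constant
regions used here are hypothetical placeholders chosen for illustration; they
are not the sequences synthesised in the examples, and the function name is
likewise an assumption.

```python
# Sketch: build one selection oligonucleotide pool per predetermined
# position, each oligonucleotide ending (3' end) in a different MAX codon.

MAX_CODONS = [
    "AAA", "AAC", "ACC", "AGC", "ATG", "ATT", "CAG", "CAT", "CCG", "CGC",
    "CTG", "GAA", "GAT", "GCG", "GGC", "GTG", "TAT", "TGG", "TGC", "TTT",
]

# Hypothetical constant regions (5'->3'), one per predetermined position.
CONSTANT_REGIONS = {1: "GACTGAAGC", 2: "TTCAGGTCA", 3: "CCATAGGAT"}


def build_selection_pool(constant_region):
    """One oligonucleotide per MAX codon, with the codon at the 3' end."""
    return [constant_region + codon for codon in MAX_CODONS]


pools = {
    position: build_selection_pool(region)
    for position, region in CONSTANT_REGIONS.items()
}

if __name__ == "__main__":
    for position, pool in pools.items():
        print(position, len(pool), pool[:3], "...")
```

Restricting a pool to fewer than 20 codons (for example only basic residues
at one position) would simply mean passing a shorter codon list for that
position.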

A further oligonucleotide was also synthesised. This further oligonucleotide
was complementary to the template DNA from its 5' end up to the nearest
predetermined position, such that oligonucleotides complementary to the full
length of the template DNA were present.

Phosphorylation 

5' Phosphorylation of appropriate selection oligonucleotide pools was
performed by the addition of Polynucleotide Kinase (New England Biolabs)
and ATP to the oligonucleotides suspended in PNK buffer (New England
Biolabs) as per the manufacturer's instructions.

Hybridisation. 

5 or 10 pmol of each selection oligonucleotide for each predetermined position
(i.e. 100 or 200 pmol of oligonucleotides for each predetermined position) was
mixed with 320 pmol template DNA and 320 pmol of the further oligonucleotide
in a total volume of 50 µl hybridisation buffer (50 mM Tris-HCl pH 7.6, 10 mM
MgCl2, 4% w/v PEG8000 (GIBCO)) to give a selection oligonucleotide :
complementary MAX-containing ("useful") template DNA ratio of ~1:1 or 2:1.

The mix was heated to 95°C for 3 minutes then cooled at a rate of 1°C/min to
26°C to allow the complementary DNA sequences to hybridise. Figure 2
shows the distribution of the different amino acid encoding codons from the
combined results of these experiments.
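
The stated ratios can be reproduced with a short calculation. This is one
reading of the figures, offered as a sketch: it assumes that, at any single
predetermined position, 20 of the 64 possible template triplets are
complementary to a MAX codon, so that "useful" template is 20/64 of the total.
The function names are illustrative.

```python
# Sketch: selection oligonucleotide pool : useful template ratio,
# assuming 20 of 64 template triplets at a position pair with a MAX codon.


def useful_template_pmol(template_pmol, n_selected=20, n_total=64):
    """Amount of template complementary to a MAX codon at one position."""
    return template_pmol * n_selected / n_total


def pool_to_useful_ratio(pmol_per_oligo, template_pmol, pool_size=20):
    """Total pmol in one selection pool divided by useful template pmol."""
    pool_pmol = pmol_per_oligo * pool_size
    return pool_pmol / useful_template_pmol(template_pmol)


if __name__ == "__main__":
    print(pool_to_useful_ratio(5, 320))    # 100 pmol pool vs 100 pmol useful
    print(pool_to_useful_ratio(10, 320))   # 200 pmol pool vs 100 pmol useful
```

On the same assumption, 40 pmol of each selection oligonucleotide with
210 pmol of template gives a ratio of roughly 12:1, in line with the value
quoted for Example 2b.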

Ligation 

After hybridisation, 1 Weiss unit of ligase (Invitrogen), ATP to 2 mM and DTT
to 1 mM were added to the hybridisation mix. This mix was incubated at 16°C
for 16 hours to allow the hybridised selection oligonucleotides to ligate.

Phenol Chloroform extraction of DNA 

The protein and DNA sequences were separated using phenol chloroform
extraction. An equal volume of DNA suspension, phenol (pH 8) and 24:1
chloroform:iso-amyl alcohol were mixed vigorously and allowed to separate,
the aqueous upper phase was carefully removed and a further extraction
undertaken. A final chloroform extraction was undertaken to remove any
traces of phenol from the DNA suspension. The DNA was then precipitated in
ice-cold ethanol and resuspended in an appropriate volume of water.

Cloning 

For gene randomisation, plasmid pGST-ZFHMA3 was derived from plasmid
pGST-ZFH, which encodes a glutathione S-transferase/zinc finger fusion
protein. Briefly, a 37 bp cassette, encompassing the three codons to be
randomised, was excised from pGST-ZFH by combined HindIII/BsiWI
digestion. The cassette was then replaced with a 20 bp oligonucleotide cassette
that contained a central SmaI restriction site. The latter 20 bp cassette changes
the reading frame of the remainder of the gene, and so ensures that no
functional zinc finger protein is encoded, unless a randomised, 37 bp cassette is
inserted successfully.
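
The reading-frame argument is simple arithmetic and can be checked as
follows. This is an illustrative sketch; the function name is an assumption,
and only the cassette lengths (37 bp and 20 bp) come from the text.

```python
# Sketch: a replacement cassette shifts the downstream reading frame
# whenever the length difference is not a multiple of three.


def shifts_reading_frame(original_len, replacement_len):
    """True if swapping the cassettes puts downstream codons out of frame."""
    return (original_len - replacement_len) % 3 != 0


if __name__ == "__main__":
    print(shifts_reading_frame(37, 20))  # 20 bp stuffer in place of 37 bp
    print(shifts_reading_frame(37, 37))  # randomised 37 bp cassette restored
```

The 17 bp difference between the two cassettes leaves a remainder of 2 on
division by 3, so the stuffer-containing plasmid is out of frame, while
re-insertion of a 37 bp cassette restores the original frame.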

In preparation for cloning, plasmid pGST-ZFHMA3 was digested with SmaI,
HindIII and BsiWI. Combined HindIII/BsiWI digestion generates sticky ends
complementary to those of the randomised cassette. Upon successful insertion
of a randomised cassette, the original coding sequence of plasmid pGST-ZFH
is restored, except at the randomised codons. The purpose of the SmaI digest
(which generates blunt ends) is to cut the 20 bp cassette and so minimise any
re-insertion. Note that the plasmid should not re-circularise in the absence of
insert DNA, since HindIII and BsiWI do not produce complementary sticky
ends.

Randomised cassettes (10 pmol total) were ligated at 16°C, overnight, into
100 ng of plasmid pGST-ZFHMA3 which had been pre-digested with SmaI,
HindIII and BsiWI, under the ligation conditions described above. The
ligations were transformed into chemically competent E. coli DH5α cells.

Preparation of chemically competent cells

SOB medium (10 ml) was inoculated with a single colony and the resulting
culture incubated with shaking at 37°C overnight. The culture (8 ml) was
inoculated into 800 ml SOB medium and the resulting culture incubated at
37°C until an OD550 of ~0.45 was reached. The cells were chilled on ice for 30
mins and pelleted by centrifugation. The supernatant was removed by
inversion and the pellet resuspended in 264 ml of RF1 buffer (100 mM RbCl,
50 mM MnCl2, 30 mM potassium acetate, 10 mM CaCl2, 15% glycerol, adjusted
to pH 5.8 with 0.2 M acetic acid). The cells were incubated on ice for 60 mins,
pelleted, resuspended in 64 ml RF2 buffer (10 mM MOPS
(4-morpholinepropanesulfonic acid), 10 mM RbCl, 75 mM CaCl2, 15% glycerol,
adjusted to pH 6.8 with NaOH) and incubated on ice for 15 mins. They were
then dispensed into 200 µl aliquots in microfuge tubes, flash frozen in liquid
nitrogen, and stored at -70°C until required.

Transformation 

Vectors were transformed into chemically competent cells by heat shock. An
aliquot of chemically competent cells was thawed on ice, the DNA added and
the mixture incubated on ice for 30 mins. The cells were heat shocked at 37°C
for 45 s and returned to ice for 2 mins. LB (800 µl) was added to each tube and
the cells were incubated at 37°C for 60 mins, with moderate agitation. The
cells were plated onto selective medium.

Plasmid DNA preparation 

Plasmid preparations were either made by Wizard mini-prep (Promega), or 
else, in high throughput format, by Birmingham Genomics lab. 

DNA sequencing 

DNA sequencing was performed by Birmingham Genomics lab on an ABI
3700 sequencer.

RESULTS 

1. Comparative Example

Figure 2 shows the distribution of the different amino acid encoding MAX
codons at the predetermined positions in clones identified as containing a MAX
encoding DNA sequence. A total of 27 clones were sequenced, giving 81
MAX encoding positions. Figure 2 shows that this method of library
production gives a reasonable distribution of MAX codons, the different
codons being present at the three predetermined positions with a frequency of
between 0 and about 10%, compared to the ideal distribution of 5% of each
MAX codon. No phenylalanine (column F) encoding MAX codons were
identified in this experiment, which may be due to degradation of the selection
oligonucleotide or due to the relatively small sample size. Ideally there should
be no non-MAX codons present at the predetermined positions. In the method
according to the comparative example non-MAX codons (column X) occur
with a frequency of about 9%. It is thought that non-MAX codons occur due to
incorrect annealing of the template DNA and one or more of the selection
oligonucleotides leading to mismatches. If the mismatches were tolerated
during ligation, the host cell would randomly correct these to either the
template sequence or the MAX sequence so that non-MAX codons could be
fixed in some clones leading to a skewing of the distribution.
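
The remark about sample size can be put in rough numerical terms. The sketch
below is illustrative only: it assumes that the sequenced positions are
independent draws, that each MAX codon has the ideal 5% frequency, and it
ignores the non-MAX fraction. The function names are assumptions.

```python
# Sketch: how likely is it to see none of a given codon by chance alone?
# Assumes independent positions and an ideal frequency of 5% per codon.


def expected_count(n_positions, frequency=0.05):
    """Expected number of occurrences of one codon among n positions."""
    return n_positions * frequency


def prob_codon_absent(n_positions, frequency=0.05):
    """Probability that one particular codon is never observed."""
    return (1.0 - frequency) ** n_positions


if __name__ == "__main__":
    n = 27 * 3  # 27 clones, three predetermined positions each
    print("expected occurrences per codon:", expected_count(n))
    print("probability a given codon is absent:", prob_codon_absent(n))
```

Under these assumptions each codon would be expected about four times among
81 positions, and the chance that one particular codon is missing purely by
sampling is in the region of 1 to 2%, so sampling alone is a possible but not
a likely explanation for a missing codon.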

2. Example 1 

Figure 3 shows schematically a method of producing randomised DNA 
libraries containing MAX codons at three specified positions according to a 
first embodiment of the present invention. 

The main stages involved in the production of the library are: -

1. mixing template DNA (A) (on a solid support (D)) randomised at the
predetermined positions, selection oligonucleotides (B) and an additional
oligonucleotide (E) having a first region (E1) complementary to the 5' end of
the template DNA and a second non-hybridisable region (E2),

2. effecting hybridisation of the oligonucleotides to template DNA
sequences having codons complementary to the MAX codons at the
predetermined positions,

3. ligating the hybridised sequences,

4. denaturing the double stranded DNA constructs,

5. removing the template DNA by centrifugation,

6. amplifying by PCR the MAX codon containing strand,

7. restriction digesting using an endonuclease to remove the non-required
region of the resulting DNA cassette, and

8. cloning the double stranded DNA constructs into an appropriate vector.

Materials and Methods.

DNA sequence production. 

Template DNA was synthesised onto Oligo-Affinity Support Polystyrene
(OASPS) beads (Glen Research) on a Beckman Oligo 1000 DNA synthesiser.
Selection oligonucleotides were synthesised as described for the comparative
example above.

An additional oligonucleotide complementary to a region of the template DNA
from its 5' end to the nearest predetermined position was also synthesised. This
oligonucleotide is extended in its 3' direction such that it extends beyond (i.e.
overhangs) the template DNA. The extended region is non-complementary
with the template DNA (and therefore will not hybridise) and serves as a
binding site for a PCR primer, so ensuring that only the MAX-codon containing
strand is amplified.



Phosphorylation, hybridisation and ligation were performed as described for
the comparative example.

Template DNA removal.

After the ligation step, the mix was heated to 95°C for 5 mins to denature the
duplex DNA. The mix was then centrifuged at 14000 rpm for 1 min (Eppendorf
microfuge) to remove the template DNA strands attached to the solid support,
leaving the newly ligated MAX encoding DNA sequences in the supernatant.

PCR.

PCR reactions were performed in a thermal cycler (MJ Engine, model PTC200)
typically in a reaction volume of 100 µl. 1 µl of supernatant containing the
single stranded MAX encoding DNA sequences was added to a PCR reaction
mix (200 µM dNTPs, 50 pM primers, Pfu DNA polymerase (Promega), 10 µl
10x PCR reaction buffer (Pfu buffer (Promega)) made up to 100 µl with double
distilled H2O). One primer was designed so as to be complementary to the
extended region at the 3' end of the MAX encoding DNA sequences, and a
second to be complementary to the 3' end of the template DNA sequence.
Even after template DNA removal, some template DNA may remain. In
practice, small amounts of template DNA in the PCR reaction mix do not
adversely affect the distribution of MAX codons. The template DNA is not
exponentially amplified as it only contains one of the primer binding sites and
so will effectively be diluted out. The reaction mix was heated to 95°C for 2
min, then 35 cycles of 94°C for 30 s, 48°C for 1 min and 72°C for 30 s were
performed before cooling to 4°C.
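
The "diluted out" argument can be illustrated by comparing idealised
exponential and linear amplification over the 35 cycles used. This is a sketch
under simplifying assumptions (perfect doubling each cycle for strands with
both primer sites, exactly one new copy per cycle for a strand with a single
site); the function names are illustrative.

```python
# Sketch: idealised copy numbers after PCR for a strand pair carrying both
# primer binding sites versus a residual template carrying only one.


def exponential_copies(initial, cycles, efficiency=1.0):
    """Copies when every product is itself a substrate in later cycles."""
    return initial * (1.0 + efficiency) ** cycles


def linear_copies(initial, cycles):
    """Copies when only the original strand is copied once per cycle."""
    return initial * (1 + cycles)


if __name__ == "__main__":
    cycles = 35
    wanted = exponential_copies(1, cycles)
    carried_over = linear_copies(1, cycles)
    print("sequence of interest:", wanted)
    print("residual template:", carried_over)
    print("fold difference:", wanted / carried_over)
```

Even allowing for efficiencies well below the ideal, the difference between
the two growth laws is many orders of magnitude, which is why a small
carry-over of template is not expected to matter.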

Restriction endonuclease digestion. 

Restriction enzymes, NEBuffer 3 and Calf Intestinal Alkaline Phosphatase
were obtained from New England Biolabs. Two PCR reactions were combined
(200 µl), a 20 µl aliquot removed for examination and the remainder extracted
with phenol/chloroform. The DNA was resuspended in 88 µl H2O, 10 µl
NEBuffer 3 (New England Biolabs) and 20 units HindIII. The digestion was
incubated at 37°C for 2 hrs and another 10 µl aliquot removed. BsiWI (20
units) was then added and the digest incubated at 55°C for 16 hrs. Calf
Intestinal Alkaline Phosphatase (10 units) was then added and the reaction
incubated at 37°C for 2 hrs. The resulting digest was extracted with
phenol/chloroform and resuspended in 40 µl H2O.

Subsequent steps were carried out in the same manner as for the comparative
example. 

The sequences of the template DNA, selection oligonucleotides and the 5' and 
3' primer sequences were: - 



iSACTGAAGCTTTAG'ri 

GACTGAAGCTTTAGmCO CRGCGACMAX rrACAftftgLTCATCAGCGTACGACG TCAG^ 
CTGACTTCGAAATCAMMMTCGCTGNMNAATGTTNMMGTAGTCGC^ 



XXXX PCR primers

MAX 1st position MAX selection oligonucleotide

XXX 2nd position MAX selection oligonucleotide

XXX 3rd position MAX selection oligonucleotide

NNN site of randomisation

RESULTS 

Figure 4 shows the distribution of the different MAX codons at the
predetermined positions in clones identified as containing a MAX encoding
DNA sequence. A total of 84 clones were sequenced giving 252 MAX
encoding positions. Figure 4 shows that this method of library production
gives greatly reduced numbers of non-MAX codons, with their frequency
reduced to below 1% (column X) as compared to about 9% in the library
produced according to the method of the comparative example (Fig. 2, column
X). This means that a DNA library containing known MAX sequences at the
predetermined positions can be produced with a high degree of certainty, by
controlling which MAX codon containing oligonucleotides are included in the
selection pool.

The distribution of the different MAX codons, however, is poor compared to
the ideal 5% incidence, varying from no serine encoding triplets (column S) to
over 15% phenylalanine and tryptophan (columns F and W respectively). It is
thought that the uneven representation of the various MAX codons may be due
to unequal concentrations within the template oligonucleotide.

3. Examples 2a and 2b 

Figure 5 shows schematically a method of producing a randomised DNA
library containing MAX codons at three specified positions according to a second
embodiment of the present invention, the method being similar to that of
Example 1. Unlike Example 1, the template DNA is not synthesised on a bead
and its removal prior to PCR is not necessary for reasons which will be
explained below.

The most important difference between Example 1 and Example 2 is that the
selection oligonucleotides (F) for the predetermined position nearest the 3' end
of the template DNA are extended at their 5' end. The extension is
non-hybridisable with and "overhangs" the template DNA. The 5' extension is
designed such that after the first round of PCR, the 3' end of the newly formed
strand (which is complementary to the 5' extension) serves as the second
primer binding site. Since neither primer will hybridise with the template
DNA, only the required sequences are amplified. Again, the restriction sites are
within the template oligonucleotide.

In Example 2a, the ratio of selection oligonucleotides to template DNA and
additional oligonucleotide was the same as for Example 1, being about 1:1
selection oligonucleotide : useful template DNA. In Example 2b, the ratio of
selection oligonucleotides to template DNA and additional oligonucleotide was
greater (about 40 pmol of each selection oligonucleotide to 210 pmol of template
DNA and additional oligonucleotide) being about 12:1 selection
oligonucleotide : useful template DNA.

The sequences of the template DNA, selection oligonucleotides and the 5' and
3' extended sequences were: -

iGftCTGAAGCTTTAClt 



XXXX PCR primers

MAX 1st position MAX selection oligonucleotide

XXX 2nd position MAX selection oligonucleotide

XXX 3rd position MAX selection oligonucleotide

NNN site of randomisation



Figures 6 and 7 show the distribution of the different MAX codons at the
predetermined positions in clones identified as containing MAX encoding
DNA sequences produced from hybridisation mixes having selection
oligonucleotide : useful template DNA ratios of 1:1 (Example 2a) and 12:1
(Example 2b) respectively.

In Example 2a, a total of 40 clones were sequenced giving 120 MAX encoding
positions. Figure 6 shows that this method of library production gives reduced
numbers of non-MAX codons, with their frequency reduced to about 2%
(column X and column *, the latter designating a stop codon) as compared to
about 9% in the library produced according to the method of the comparative
example (Fig. 2, column X). However, the distribution of MAX codons is poor
with large numbers of alanine, glutamic acid and tryptophan (columns A, E and
W respectively) encoding codons present and no or very few leucine,
glutamine, arginine or serine (columns L, Q, R and S respectively) encoding
codons.

In Example 2b, a total of 37 clones were sequenced giving 111 MAX encoding
positions. Figure 7 shows that this method of library production gives reduced
numbers of non-MAX codons, with their frequency reduced to below 4%
(column X) as compared to about 9% in the library produced according to the
method of the comparative example (Fig. 2, column X), but higher numbers of
non-MAX codons compared with the method of Example 1. However, the
distribution of MAX codons is better than for Example 1. The use of
a large excess of selection oligonucleotides may improve the distribution of
MAX codons by minimising the negative effect of any possible template DNA
bias.

A comparison of figures 6 and 7 shows that increasing the ratio of selection
oligonucleotide sequences : useful template DNA greatly improves the
distribution of MAX codons present at the positions of interest. Although the
number of non-MAX codons present increases slightly, this level is still below
that seen in the comparative example.

4. Example 3 

When the complementary region between the overhang-containing
oligonucleotide and the template DNA at its 3' end is short and a MAX codon
is located within the hybridising region of that oligonucleotide, the above
method of library production may lead to a residual bias toward G/C rich MAX
codons at that position due to the higher bond strength of G/C bonds compared
with A/T bonds. To attempt to eliminate this bias, the template DNA has been
extended at its 3' end relative to that shown for Example 2 (the extended region
being removed by a restriction endonuclease prior to cloning) and the relevant
selection oligonucleotide divided into a constant sequence and a shorter
selection oligonucleotide. This modification should prevent any G/C bias at
that position of randomisation. New template DNA and new PCR primers
having the sequences shown below have been synthesised and used to produce
a DNA sequence library. It will be seen from the sequence below that the 3'
end of the template DNA has been extended by six bases beyond the end of the
selection oligonucleotide at the 3' end of the template DNA. If this overlap
region is too long, for example 18 bases, then the second additional sequence
can bind to the template DNA during PCR and act as a primer leading to
unwanted amplification of the template DNA.
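
To make the possible G/C bias concrete, the MAX codons can be grouped by the
number of G or C bases they contain. This is an illustrative sketch only: G/C
count is used here as a crude stand-in for pairing strength, and the function
names are assumptions.

```python
# Sketch: group the MAX codons by their number of G/C bases, a crude
# proxy for how strongly each codon pairs with its complement.

MAX_CODONS = [
    "AAA", "AAC", "ACC", "AGC", "ATG", "ATT", "CAG", "CAT", "CCG", "CGC",
    "CTG", "GAA", "GAT", "GCG", "GGC", "GTG", "TAT", "TGG", "TGC", "TTT",
]


def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


def group_by_gc(codons):
    """Dictionary mapping G/C count (0 to 3) to the codons with that count."""
    groups = {n: [] for n in range(4)}
    for codon in codons:
        groups[gc_count(codon)].append(codon)
    return groups


if __name__ == "__main__":
    for n, codons in group_by_gc(MAX_CODONS).items():
        print(n, codons)
```

The set spans the full range, from codons with no G/C bases to codons made up
entirely of G/C, so a short hybridising region containing the codon could
plausibly favour the latter group.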



[Sequence diagram: extended template DNA with PCR primers and selection oligonucleotides, annotated according to the key below]



XXXX PCR primers

MAX 1st position MAX selection oligonucleotide 

XXX 2nd position MAX selection oligonucleotide 

XXX 3rd position MAX selection oligonucleotide 

NNN site of randomisation 

Using the above embodiments to produce DNA sequence libraries having
predetermined positions of randomisation also allows a number of consecutive
codons to be randomised using trinucleotides as the selection oligonucleotide
pools to hybridise to the randomised positions. This was not feasible using the
method according to the comparative example due to potential misalignments
leading to frameshift mutations.